\chapter{Inputs to Dakota}\label{input}

\section{Overview of Inputs}\label{input:overview}

Dakota supports a number of command-line arguments, as described in
Section~\ref{tutorial:installation:running}.  Among these are
specifications for the Dakota input file and, optionally, a restart
file.  The syntax of the Dakota input file is described in detail in
the Dakota Reference Manual~\cite{RefMan}, and the restart file is
described in Chapter~\ref{restart}.

A Dakota input file may be prepared manually with a text editor such
as Emacs, Vi, or WordPad, or it may be defined with the JAGUAR Dakota
graphical user interface.  The JAGUAR Dakota GUI is built on the
Java-based Eclipse Framework \cite{Eclipse} and presents the Dakota
input specification options in synchronized text-editing and graphical
views.  JAGUAR includes templates and wizards for helping create
Dakota studies and can invoke Dakota to run an analysis.  The Dakota
GUI for Linux, Windows, and Mac, is available for download from the
Dakota website \url{http://dakota.sandia.gov/}, along with licensing
information, separate GUI documentation, and installation tips.

\subsection{Tabular Data Formats}\label{input:tabularformat}

The Dakota input file and/or command line may identify additional
files for data import as described in Section~\ref{input:import}.
Some of these files are in tabular data format.

Dakota versions 5.1+ (October 2011) and newer use two formats for
tabular data file input and output.  Tabular data refer to numeric
data in text form related to, e.g., tabular graphics data, least
squares and Bayesian calibration data, samples/points files for
constructing surrogates, pre-run output, and post-run input.  Both
formats are written/read with C++ stream operators/conversions, so
most integer and floating point formats are acceptable for numeric
data.  The formats are:

\begin{itemize}
  \item {\bf Annotated Matrix} (default for all I/O; specified via
    {\tt annotated}): text file with one leading row of comments/column
    labels and one leading column of evaluation/row IDs surrounding
    num\_rows x num\_cols whitespace-separated numeric data, (newlines
    separating rows are not currently required, but may be in the
    future).  The numeric data in a row may correspond to variables,
    variables followed by responses, data point for calibration, etc.,
    depending on context.

   \item {\bf Free-form Matrix} (optional; previously default for
     samples files and least squares data; specified via {\tt
       freeform}): text file with no leading row and no leading
     column.  The num\_rows x num\_cols total numeric data entries may
     appear separated with any whitespace including arbitrary spaces,
     tabs, and newlines.  In this format, vectors may therefore appear
     as a single row or single column (or mixture; entries will
     populate the vector in order).
\end{itemize}

{\bf Attention:} Prior to October 2011, calibration and surrogate data
files were free-form format.  They now default to annotated format,
though {\tt freeform} remains an option.  For both formats, a warning
will be generated if a specific number of data are expected, but extra
is found and an error generated when there is insufficient data.  Some
TPLs like SCOLIB and JEGA manage their own file I/O and only support
the free-form option.

\section{Data Imports}\label{input:import}

The Dakota input file and/or command line may identify additional
files used to import data into Dakota.

\subsection{AMPL algebraic mappings: stub.nl, stub.row, and stub.col}

As described in Section~\ref{interfaces:algebraic}, an AMPL
specification of algebraic input-to-output relationships may be
imported into Dakota and used to define or augment the mappings of a
particular interface.

\subsection{Genetic algorithm population import}

Genetic algorithms (GAs) from the JEGA and SCOLIB packages support a
population import feature using the keywords
\texttt{initialization\_type flat\_file = \emph{STRING}}.  This is
useful for warm starting GAs from available data or previous runs.
Refer to the Method Specification chapter in the Dakota Reference
Manual~\cite{RefMan} for additional information on this specification.
The flat file must be in free-form format as described in
Section~\ref{input:tabularformat}.

\subsection{Calibration (least squares) data import}

Deterministic least squares and Bayesian Calibration methods allow
specification of experimental observations to difference with
responses in calculating a model misfit metric (such as a sum of
squared residuals).  The default file format is {\tt annotated}, but
the {\tt freeform} option is supported.  The data file is comprised of
one or more experiments, each with one or more replicates, one row per
replicate.  The columns of the data file contain, in sequence
\begin{itemize}
\item {\bf configuration variables (optional):} state variable values
  indicating the configuration at which this experiment was conducted;
  length must agree with thee number of state variables active in the
  study.
\item {\bf experimental observations (required):} experimental data
  values to difference with model responses; length number of
  responses.
\item {\bf experimental standard deviations (optional):} measurement
  errors (standard deviations) associated with the data; length 1
  (same value for each model response) or num\_responses.
\end{itemize}
See~\ref{nls:examples} and the Dakota Reference Manual Responses
section for further details.

\subsection{PCE coefficient import}

Polynomial chaos expansion (PCE) methods compute coefficients for
response expansions which employ a basis of multivariate orthogonal
polynomials.  Normally, the \texttt{polynomial\_chaos} method
calculates these coefficients based either on a spectral projection or
a linear regression (see Section~\ref{uq:expansion}).  However,
Dakota also supports the option of importing a set of response PCE
coefficients based on the specification \\
\texttt{expansion\_import\_file = \emph{STRING}}.  This is useful for
evaluating moments analytically or computing probabilities numerically
from a known response expansion.  Refer to the Method Specification
chapter in the Dakota Reference Manual~\cite{RefMan} for additional
information on this specification.

\subsection{Surrogate construction from data files}

Global data fit surrogates may be constructed from a variety of data
sources.  One of these sources is an auxiliary data file, as specified
by the keywords \texttt{reuse\_samples points\_file = \emph{STRING}}.
The file may be in annotated (default) or free-form format with
columns corresponding to variables and responses.  Refer to the Model
Specification chapter in the Dakota Reference Manual~\cite{RefMan} for
additional information on this specification.

\subsection{Variables/responses import to post-run}

The post-run mode (supported only for sampling, parameter study, and
DACE methods) requires specification of a file containing parameter
and response data in annotated tabular format (see Section
~\ref{input:tabularformat}; free-form is not supported).  An
evaluation ID column is followed by columns for variables, then those
for responses, with an ignored header row of labels and then one row
per evaluation.  Typically this file would be generated by executing
\texttt{dakota -i dakota.in -pre\_run ::variables.dat} and then adding
columns of response data to variables.dat to make varsresponses.dat.
The file is specified at the command line with:
\begin{small}
\begin{verbatim}
    dakota -i dakota.in -post_run varsresponses.dat::
\end{verbatim}
\end{small}

%PCE coefficients
